Abstract: How can we protect the network infrastructure 
from malicious traffic, such as scanning, malicious code prop- 
agation, and distributed denial-of-service (DDoS) attacks? One 
mechanism for blocking malicious traffic is filtering: access 
control lists (ACLs) can selectively block traffic based on fields of 
the IP header. Filters (ACLs) are already available in the routers 
today but are a scarce resource because they are stored in the 
expensive ternary content addressable memory (TCAM). In this 
paper, we develop, for the first time, a framework for studying 
filter selection as a resource allocation problem. Within this 
framework, we study five practical cases of source address/prefix 
filtering, which correspond to different attack scenarios and 
operator's policies. We show that filter selection optimization 
leads to novel variations of the multidimensional knapsack 
problem and we design optimal, yet computationally efficient, 
algorithms to solve them. We also evaluate our approach using 
data from Dshield.org and demonstrate that it brings significant 
benefits in practice. Our set of algorithms is a building block 
that can be immediately used by operators and manufacturers 
to block malicious traffic in a cost-efficient way. 

I. Introduction 

How can we protect our network infrastructure from ma- 
licious traffic, such as scanning, malicious code propagation, 
spam, and distributed denial-of-service (DDoS) attacks? These 
activities cause problems on a regular basis ranging from 
simple annoyance to severe financial, operational and political 
damage to companies, organizations and critical infrastructure. 
In recent years, they have increased in volume, sophistication, 
and automation, largely enabled by botnets that are used as 
the platform for launching these attacks. 

Protecting a victim (host or network) from malicious traffic 
is a hard problem that requires the coordination of sev- 
eral complementary components, including non-technical (e.g. 
business and legal) and technical solutions (at the application 
and/or network level). Filtering support from the network is 
a fundamental building block in this effort. For example, the 
victim's ISP may install filters to react to an ongoing attack, by 
blocking malicious traffic before it reaches the victim. Another 
ISP may want to proactively identify and block the malicious 
traffic before it reaches and compromises vulnerable hosts in 
the first place. In either case, filtering is a necessary operation 
that must be performed within the network. 

Filtering capabilities are already available at the routers 
today via access control lists (ACLs). ACLs allow a router to 
match a packet header against rules [1] and are currently used 
for enforcing a variety of policies, including infrastructure 
protection [2]. For the purpose of blocking malicious traffic, 
a filter is a simple ACL rule that denies access to a source 
IP address or prefix. To keep up with the high rates of 
modern routers, it is important that filtering is implemented 
in hardware: indeed ACLs are stored in the Ternary Content 
Addressable Memory (TCAM), which allows for parallel ac- 
cess and reduces the number of lookups per forwarded packet. 
However, TCAM is more expensive and consumes more space 
and power than conventional memory. The size and cost of 
TCAM puts a limit on the number of filters and this is not 
expected to change in the near future.^1 With thousands or tens 
of thousands of filters per path, an ISP alone cannot hope to 
block the currently witnessed attacks, not to mention attacks 
from multimillion-node botnets expected in the near future. 

Consider the example shown in Fig. 1(a): an attacker com- 
mands a large number of compromised hosts to send traffic 
towards a victim V (say a webserver), thus exhausting the 
resources of V and preventing it from serving its legitimate 
clients; the ISP of V tries to protect its client from the attack, 
by blocking the attack at the gateway router G. Ideally, G 
would like to assign a single filter to block each malicious 
IP source. However, there are fewer filters than attackers and 
aggregation is typically used: a single filter blocks an entire 
source address prefix. This has the desired effect of reducing 
the number of filters but also the side-effect of blocking 
legitimate traffic originating from that prefix. Therefore, filter 
selection becomes an optimization problem that tries to block 
as many malicious and as few legitimate sources as possible, 
given a certain budget on the number of filters. 

In this paper, we formulate, for the first time, a general 
framework for studying filter selection as a resource allocation 
problem. To the best of our knowledge, the optimal filter 
selection aspect has not been explored so far, as most related 
work on filtering has focused on protocol and architectural 
aspects. Within this framework, we consider five practical 
source address filtering problems, depending on the attack 
scenario and the operator's policy and constraints. Our con- 
tributions are twofold. On the theoretical side, filter selection 
optimization leads to novel variations of the multidimensional 
knapsack problem, and we exploit the special structure of 
each problem to design optimal and computationally efficient 
algorithms. On the practical side, we provide a set of cost-efficient 
algorithms that can be used both by operators to block 
malicious traffic and by router manufacturers to optimize the 
use of their TCAM and eventually optimize the cost of the 
routers. We would like to emphasize that we do not propose a 
novel architecture for dealing with malicious traffic; instead, 
we optimize the use of an important mechanism that already 
exists on the Internet today and can be immediately used as a 
building block in larger defense systems, as discussed in detail 
in Section V-A. 

^1 A router linecard or supervisor-engine card typically supports a single 
TCAM chip with tens of thousands of entries. For example, the Cisco Catalyst 
4500, a mid-range switch, provides a 64,000-entry TCAM to be shared among 
all its interfaces (48-384). The Cisco 12000, a high-end router used at the Internet 
core, provides 20,000 entries that operate at line speed per linecard (up to 
4 Gigabit Ethernet interfaces). The Catalyst 6500 switch can fit 16K-32K 
patterns and 2K-4K masks in the TCAM. Depending on how an ISP connects 
to its clients, each individual client can typically use only part of these ACLs, 
i.e. a few hundred to a few thousand filters. 

Fig. 1. Example of a distributed attack. Let's assume that the gateway router G has only two filters available to block malicious traffic and protect the victim 
V. It uses F1 to block a single malicious address (A) and F2 to block prefix a.b.c.*, which contains 3 malicious sources but also one legitimate source (B). 
Therefore, the selection of filter F2 trades off the collateral damage (blocking B) for the reduction in the number of filters (from 3 to 1). 

The structure of the paper is as follows. In Section II- 
A, we formulate the general framework for studying filter 
selection. In Section III, we study five specific problems 
that correspond to different attack scenarios and operator's 
policies: blocking all addresses in a blacklist (BLOCK-ALL); 
blocking some addresses in a blacklist (BLOCK-SOME); 
blocking all/some addresses in a time-varying blacklist (TIME- 
VARYING BLOCK-ALL/SOME); blocking flows during a 
DDoS flooding attack to meet bandwidth constraints (FLOOD- 
ING); and distributed filtering across several routers during 
flooding (DIST-FLOODING). For each problem, we design an 
optimal, yet computationally efficient, algorithm to solve it. In 
Section IV, we use data from Dshield.org [3] to evaluate the 
performance of our algorithms in realistic attack scenarios and 
demonstrate that they bring significant benefit in practice. In 
Section V, we position our work within (a) the bigger picture 
of defense against malicious traffic and (b) related knapsack 
problems. Section VI concludes the paper. 

II. Problem Formulation and Framework 
A. Definitions and Notation 

Let us first define the notation used throughout the paper, 
also summarized in Table I. 

Source IP addresses and prefixes. Every IPv4 address is a 
32-bit sequence. Using the standard notation IP/mask, we use 
p/l to denote a prefix p of length l bits; l and p can take values 
$l = 0, 1, \ldots, 32$ and $p = 0, 1, \ldots, 2^l - 1$ respectively. Sometimes, 
for brevity, we will write simply p to indicate prefix p/l. We 
write $i \in p/l$ to indicate that address i is within the $2^{32-l}$ 
addresses covered by prefix p/l. 
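
The prefix notation maps directly onto integer bit operations. The following is a minimal sketch, assuming addresses are held as Python integers and a prefix is written as (base address, length) in the spirit of the IP/mask notation; the helper names `prefix_size` and `covers` are hypothetical and not part of the paper.

```python
def prefix_size(length: int, bits: int = 32) -> int:
    """Number of addresses covered by a prefix of the given length: 2^(bits - length)."""
    return 1 << (bits - length)


def covers(base: int, length: int, addr: int, bits: int = 32) -> bool:
    """True if addr lies within base/length, i.e. shares its first `length` bits."""
    shift = bits - length
    return (addr >> shift) == (base >> shift)


# 10.0.0.0/8 written as an integer base address (assumed example values).
base = 10 << 24
addr = (10 << 24) | (1 << 16) | (2 << 8) | 3   # 10.1.2.3
print(prefix_size(8))          # 16777216 addresses
print(covers(base, 8, addr))   # True
```

A prefix of length 32 covers a single address and a prefix of length 0 covers the whole address space, which matches the two extremes of the range of l.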

Blacklists. A blacklist ($\mathcal{BL}$) is a list of N unique malicious 
source IP addresses, which send malicious traffic towards the 



victim. Identifying which sources are malicious and should be 
blocked is a difficult problem in its own right, but orthogonal 
to the focus of this paper. We consider that the set of malicious 
IP sources is accurately identified by another module (e.g. 
an intrusion detection system and/or historical data) in a pre- 
processing step and is given as input to our problem. (For a 
discussion of these assumptions, see Section V-A.) 

An address is considered "bad" if it appears in a blacklist 
or "good" if it belongs to a whitelist (a set of legitimate 
addresses) $\mathcal{G}$, which may or may not be explicitly given. In 
the latter case, $\mathcal{G}$ includes all addresses that are not in $\mathcal{BL}$. 

Address Weight. In the simplest version of the problem, an 
address is simply either bad or good, depending on whether 
it appears or not in a blacklist respectively. In a more general 
framework, a weight $w_i$ can be assigned to every address i 
to indicate the importance of an address. We use $w_i < 0$ for 
every bad address i to indicate the benefit from blocking it; we 
use $w_i > 0$ for every good address i to indicate the collateral 
damage from blocking it; $w_i = 0$ indicates indifference about 
whether address i will be blocked or not. 

The weight $w_i$ can have different interpretations depending 
on the problem, as we will see later. First, it can capture the 
amount of bad/good traffic originating from an IP address and 
therefore the benefit/cost of blocking that address. Second, $w_i$ 
can express policy: e.g. depending on the amount of money 
gained/lost by the ISP when blocking address i, the operator 
can decide to assign large positive weights to its important 
customers that should not be blocked, or large negative weights 
to the worst attackers that must be blocked.^2 

Filters. In this paper, we focus on source address/prefix 
filtering. A filter is a simple ACL rule that specifies that all 
addresses in prefix p/l should be blocked. Fmax denotes the 
maximum number of filters available in TCAM and is given 
as input to our problem. Notice that filter optimization is only 
meaningful when the number of available filters Fmax is much 

^2 The higher the absolute value of the weight assigned to an individual 
bad/good address, the higher the preference to block/not block that address. If 
all good and bad addresses are assigned the same $w_g$ and $-w_b$ respectively, 
then the ratio $w_g / w_b$ is a parameter that the operator can tune to express how 
much she values low collateral damage vs. blocked malicious traffic. At the 
extreme, $w_i = \infty$ ($-\infty$) indicates that address i must never (always) be 
blocked. 
smaller than the number of malicious sources N, which is 
indeed the case in practice (see introduction and [1], [2]). 

The decision variable $x_{p/l} \in \{0, 1\}$ is 1 if a filter is assigned 
to block prefix p/l, and 0 otherwise. A filter p/l blocks all $2^{32-l}$ 
addresses in that range. This has the desired effect of blocking 
all bad traffic $b_{p/l} = \left| \sum_{i \in p/l \cap \mathcal{BL}} w_i \right|$, but also the side-effect of 
blocking all legitimate traffic $g_{p/l} = \sum_{i \in p/l \cap \mathcal{G}} w_i$ originating 
from that prefix. An effective filter should have a large benefit 
$b_{p/l}$ and low "collateral damage" $g_{p/l}$. 
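
To make the two quantities concrete, the sketch below computes the benefit and the collateral damage of one candidate filter. It assumes weights are given as a dictionary from integer address to weight, with the sign convention introduced above, and a prefix written as (base address, length); the function name `benefit_and_damage` and the toy weights are illustrative assumptions.

```python
def benefit_and_damage(base, length, weights, bits=32):
    """Return (b, g) for prefix base/length.

    weights maps address -> w_i, with w_i < 0 for blacklisted (bad) addresses
    and w_i > 0 for legitimate (good) ones.
    """
    shift = bits - length
    b = g = 0
    for addr, w in weights.items():
        if (addr >> shift) == (base >> shift):   # addr is inside the prefix
            if w < 0:
                b += -w      # blocked malicious traffic
            else:
                g += w       # blocked legitimate traffic
    return b, g


# Toy 4-bit address space with assumed unit weights.
weights = {0: -1, 1: 1, 2: 1, 3: -1, 4: -1, 5: -1}
print(benefit_and_damage(0, 2, weights, bits=4))   # (2, 2): prefix 00**
print(benefit_and_damage(4, 3, weights, bits=4))   # (2, 0): prefix 010*
```

In this toy case, prefix 010* is the more attractive filter: it blocks as much bad traffic as 00** without any collateral damage.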

B. Rationale and Overview of Filtering Problems 

Given a set of malicious and legitimate sources, and a 
measure of their importance ($w_i$'s), the goal of filter selection is 
the construction of filtering rules, so as to minimize the impact 
of malicious sources on the network using the available net- 
work resources (e.g. filters and link capacity). Depending on 
the attack scenario, and the operator's policy and constraints, 
different problems may arise. E.g. the operator might want to 
block all malicious sources, or might tolerate leaving some 
unblocked; the attack might be of a low rate or a flooding 
attack; the operator may control one or several routers. 

At the core of each filtering problem lies the following: 

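\begin{align}
\min \quad & \sum_{p/l} \sum_{i \in p/l} w_i \, x_{p/l} \tag{1} \\
\text{s.t.} \quad & \sum_{p/l} x_{p/l} \le F_{\max} \tag{2} \\
& \sum_{p/l \,:\, i \in p/l} x_{p/l} \le 1 \quad \forall i \in \mathcal{BL} \tag{3} \\
& x_{p/l} \in \{0, 1\} \quad \forall l = 0, \ldots, 32,\; p = 0, \ldots, 2^l - 1 \tag{4}
\end{align}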

Eq.(1) expresses the objective to minimize the total cost for 
the network, which consists of two parts: the collateral damage 
(terms with $w_i > 0$) and the cost of leaving malicious traffic 
unblocked (terms with $w_i < 0$). We use the notation $\sum_{p/l}$ to 
denote summation over all possible prefixes p/l: $l = 0, \ldots, 32$, 
$p = 0, \ldots, 2^l - 1$. Eq.(2) expresses the constraint on the number 
of filters. Eq.(3) states that overlapping filters are mutually 
exclusive, i.e. each malicious address should be blocked at 
most once, otherwise filtering resources are wasted. Eq.(4) 



lists the decision variables $x_{p/l}$ corresponding to all possible 
prefixes; it is part of every optimization problem in this paper 
and will be omitted from now on for brevity. 

Eq.(1)-(4) provide the general framework for filter selection 
optimization. Different filtering problems can be written as 
special cases within this framework, possibly with additional 
constraints. As we discuss in Section V-B, these are all 
multidimensional knapsack problems [4], which are, in general, 
NP-hard. The specifics of each problem dramatically affect the 
complexity, which can vary from linear to NP-hard. 
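
As a sanity check on the general framework, the objective and constraints can be evaluated exhaustively on a toy address space. The sketch below is not one of the paper's algorithms: it is a brute-force reference, assuming a 4-bit address space, unit weights, the blacklist of the Fig. 2 example, and prefixes written as (base address, length). The function name `brute_force_filters` is hypothetical. For simplicity it forbids any overlap between chosen prefixes, which does not change the optimum when good weights are non-negative.

```python
from itertools import combinations


def brute_force_filters(weights, f_max, bits=4):
    """Exhaustively minimise Eq.(1) subject to Eq.(2)-(3) on a tiny address space.

    weights: dict address -> w_i (negative = bad, positive = good, missing = 0).
    Returns (best cost, tuple of (base, length) prefixes).
    """
    # Every prefix p/l, written as (base address, length).
    prefixes = [(p << (bits - l), l) for l in range(bits + 1) for p in range(1 << l)]

    def members(prefix):
        base, length = prefix
        return range(base, base + (1 << (bits - length)))

    def cost(prefix):
        return sum(weights.get(a, 0) for a in members(prefix))

    def disjoint(chosen):
        # Simplification: forbid any overlap, not only overlap on bad addresses.
        seen = set()
        for prefix in chosen:
            addrs = set(members(prefix))
            if seen & addrs:
                return False
            seen |= addrs
        return True

    best = (0, ())                      # using no filter at all costs 0
    for k in range(1, f_max + 1):       # Eq.(2): at most f_max filters
        for chosen in combinations(prefixes, k):
            if disjoint(chosen):
                total = sum(cost(p) for p in chosen)
                if total < best[0]:
                    best = (total, chosen)
    return best


bad = {0, 3, 4, 5, 7, 8, 10, 11, 12}    # blacklist of the Fig. 2 example
weights = {a: (-1 if a in bad else 1) for a in range(16)}
print(brute_force_filters(weights, 2))
```

The search space grows combinatorially with the number of prefixes and filters, which is why the structure-exploiting algorithms of Section III matter in practice.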

In this paper, we formulate five practical filtering problems, 
and we develop optimal, yet computationally efficient algo- 
rithms to solve them. Here, we summarize the rationale behind 
each problem and our main results. The exact formulation and 
detailed solution for each problem are provided in Section III. 

[P1] BLOCK-ALL: Assume that a blacklist $\mathcal{BL}$ and a 
whitelist $\mathcal{G}$ are given; a weight is also associated with every 
good address to indicate the amount of legitimate traffic 
originating from that address. The limit on the number of 
filters is Fmax. The first practical goal an operator may have 
is to choose a set of filters that block all malicious sources so 
as to minimize the collateral damage. We design an optimal 
algorithm that solves this problem at low complexity (linearly 
increasing with N, i.e. the lowest achievable complexity for 
this problem). 

[P2] BLOCK-SOME: Assume that the same blacklist and 
whitelist are given, as in P1. However, the operator may be 
willing to block only some (instead of all) malicious addresses, 
so as to decrease the collateral damage, at the expense of 
leaving some malicious traffic unblocked. She can achieve this 
by assigning weights $w_i > 0$ and $w_i < 0$ to good and bad 
addresses, respectively, to express their relative "importance". 
The goal of P2 is to block only those subsets of malicious 
addresses that have the highest impact and are not co-located 
with important legitimate sources, so as to minimize the total 
cost in Eq.(1). We design an optimal, computationally efficient 
(linearly increasing with N) algorithm for this problem too. 

[P3] TIME-VARYING BLOCK-ALL (SOME): Assume 
that a set of blacklists $\{\mathcal{BL}_{T_0}, \mathcal{BL}_{T_1}, \ldots, \mathcal{BL}_{T_i}, \ldots\}$ and a 
set of whitelists $\{\mathcal{G}_{T_0}, \mathcal{G}_{T_1}, \ldots, \mathcal{G}_{T_i}, \ldots\}$ are given at different 
times $T_0 < T_1 < \cdots < T_i < \cdots$; a weight is also 
associated with every address; the limit on the number of filters 
is Fmax. The goal of P3 is to exploit temporal correlation 
between blacklists at successive times and, given the solution 
to BLOCK-ALL(SOME) for input blacklist $\mathcal{BL}_{T_{i-1}}$, to 
efficiently update the filtering rules and construct the solution to 
BLOCK-ALL(SOME) with input blacklist $\mathcal{BL}_{T_i}$. 

[P4] FLOODING: In a distributed flooding attack, such as 
the one shown in Fig. 1, a large number of compromised hosts 
send traffic to the victim with the purpose of exhausting the 
victim's access bandwidth. The problem is well-known and 
increasingly frequent and severe. Our framework can be used 
to optimally select filters in this case, so as to minimize the 
collateral damage and meet the bandwidth constraint (i.e. the 
total bandwidth of the unblocked traffic should not exceed the 
bandwidth of the flooded link, e.g. link G-V in Fig. 1). The 
input is the same as in P1-P2, and the weights capture the 
traffic volume originating from each IP source. We prove that 
the problem P4 is NP-hard and we design a pseudo-polynomial 
algorithm that optimally solves problem P4 with complexity 
that grows linearly with the number of sources in the blacklist 
and the whitelist, $|\mathcal{BL}| + |\mathcal{G}|$. 

[P5] DIST-FLOODING: All the above problems aim at 
selecting filters at a single router. However, a network ad- 
ministrator, of an ISP or campus network, may use the 
filtering resources collaboratively across several routers to 
better defend against an attack. (Distributed filtering may also 
be enabled by the cooperation across several ISPs against a 
common enemy.) The question then is not only which filters 
to select but also on which router to place them. Here, we 
focus on DIST-FLOODING, which is the practical case of 
distributed filtering, across several routers, against a flooding 
attack. We prove that P5 can be decomposed into several 
FLOODING problems that can be solved independently and 
optimally one at each router. 

III. Filtering Problems and Algorithms 
In this section, we give the detailed formulation of each 
problem and the algorithm that solves it. But first, let us define 
a data structure that we use to represent the problem and to 
develop all the subsequent algorithms. 

A. Data Structure for Representing the Problem 

Definition 1 (LCP Tree): Given a set A of IP addresses, 
we define the Longest Common Prefix tree of A, LCP(A), 
as the binary tree whose leaves represent the N IPs and all 
other nodes represent all and only the longest common prefixes 
between any pair of IPs in A. The prefixes are organized 
in the natural IP hierarchy, with shorter prefixes towards 
the root and longer prefixes towards the leaves, so that the 
prefix corresponding to a parent node includes the prefixes 
corresponding to its two children. 

An example is shown and discussed in Fig. 2. 

The LCP tree can be constructed from the binary tree of all 
prefixes, by removing the branches that do not have malicious 
IPs and then by removing nodes with a single child. It reduces 
the storage for representing candidate prefixes by encoding 
those prefixes that are part of a feasible solution. The LCP 
tree is a variation of the binary (unibit) trie [5] but does not 
have nodes with a single child. We do not claim novelty in 
this data structure but we describe it in detail because we use 
it extensively in the design of the algorithms. 

Complexity: We can build the LCP tree from N malicious 
addresses by performing N insertions in a Patricia trie [5]. 
To insert a string of m bits, we need at most m comparisons. 
Thus, the worst case complexity is O(mN), where m = 32 
(bits) is the constant length of an IP address. 

We will make extensive use of the LCP tree in all algorithms 
in the rest of this section, as it provides a compact way to 
represent feasible solutions and to efficiently select the optimal 
one. Note that every node in the LCP-tree is a candidate prefix 
p/l; for brevity of notation, we will use interchangeably the 
notation p/l and its shorter version p. 
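
The LCP tree can be sketched in a few lines. The version below is an illustrative alternative to the Patricia-trie insertions mentioned above: it builds the tree top-down by recursively splitting the sorted address list, which yields the same nodes. The class `Node`, the functions `build_lcp_tree` and `internal_prefixes`, and the (base address, length) representation of a prefix are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    base: int                      # first address covered by the prefix
    length: int                    # prefix length l
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_lcp_tree(addresses: List[int], bits: int = 32) -> Node:
    """Build the LCP tree of a non-empty list of integer addresses."""
    addrs = sorted(set(addresses))

    def build(group: List[int]) -> Node:
        lo, hi = group[0], group[-1]
        if lo == hi:                               # single address: a leaf
            return Node(lo, bits)
        # In a sorted group, the common prefix of all members is that of the extremes.
        length = bits - (lo ^ hi).bit_length()
        shift = bits - length
        base = (lo >> shift) << shift
        split_bit = 1 << (shift - 1)               # first bit on which members differ
        left = [a for a in group if not a & split_bit]
        right = [a for a in group if a & split_bit]
        return Node(base, length, build(left), build(right))

    return build(addrs)


def internal_prefixes(node: Node) -> List[tuple]:
    """List the (base, length) pairs of all non-leaf nodes."""
    if node.left is None:
        return []
    return ([(node.base, node.length)]
            + internal_prefixes(node.left)
            + internal_prefixes(node.right))


# 4-bit toy blacklist with nine malicious addresses.
tree = build_lcp_tree([0, 3, 4, 5, 7, 8, 10, 11, 12], bits=4)
print(internal_prefixes(tree))
```

Every internal node has exactly two children by construction, so a tree with N leaves has N - 1 internal nodes; for the nine addresses above, that gives eight candidate aggregate prefixes.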
Fig. 2. Example of LCP-tree used in BLOCK-ALL. For ease 
of illustration, consider a 4-bit (instead of 32-bit) address space, 
i.e. from 0000 to 1111. Let $\mathcal{BL}$ = {0,3,4,5,7,8,10,11,12} be the 
set of malicious IPs, corresponding to the leaves of the binary tree. 
All remaining IPs (1, 2, 6, 9, 13, 14, 15) are considered legitimate and 
not explicitly shown. Every intermediate node represents the longest 
common prefix (LCP) covering all malicious sources in that subtree; 
it is associated with a cost measuring the additional collateral damage 
caused when we filter that node, instead of filtering each of its 
children. E.g. the LCP of malicious addresses 0=0000 and 3=0011 is 
prefix 00**; if filter 00** is chosen instead of filters 0000 and 0011, 
collateral damage of 2 is caused, because the legitimate addresses 1 
and 2 are also blocked. Choosing a set of source prefixes to filter is 
equivalent to choosing a set of nodes in this LCP tree. E.g. a feasible 
solution to BLOCK-ALL consists of prefixes {0/2,4/2,8/2, 12/4} 
that cover all malicious IPs. 

B. BLOCK-ALL 

Goal. Given: (i) a blacklist of malicious addresses $\mathcal{BL}$, 
(ii) a set of legitimate sources $\mathcal{G}$, (iii) weights assigned to each 
legitimate source address, indicating the amount of traffic from 
that address, and (iv) a limit on the number of filters Fmax; 
select source address prefixes so as to block all malicious 
sources and minimize the collateral damage. 

Formulation. This can be formulated within the general 
framework of Eq.(1)-(4) by assigning $w_i > 0$ to good 
addresses (the amount of legitimate traffic) and weight $w_i = 0$ 
to each malicious source. The goal is to minimize the total 
cost, which in this case is simply the total legitimate traffic 
blocked: $\sum_{p/l} \sum_{i \in p/l} w_i \, x_{p/l}$, where $\sum_{i \in p/l} w_i = \sum_{i \in p/l \cap \mathcal{G}} w_i + 0 = g_{p/l}$. 
Constraint Eq.(7) enforces that every malicious source should 
be blocked by exactly one filter. 

\begin{align}
\min \quad & \sum_{p/l} g_{p/l} \, x_{p/l} \tag{5} \\
\text{s.t.} \quad & \sum_{p/l} x_{p/l} \le F_{\max} \tag{6} \\
& \sum_{p/l \,:\, i \in p/l} x_{p/l} = 1 \quad \forall i \in \mathcal{BL} \tag{7}
\end{align}

Characterizing an Optimal Solution. In the algorithm, we 
search for solutions that can be represented as a subtree of the 
LCP tree structure, as described in the following: 



Proposition 3.1: Given $\mathcal{BL}$ and Fmax, there exists an 
optimal solution of BLOCK-ALL that can be represented as a 
pruned subtree of LCP-tree($\mathcal{BL}$) with: the same root, up to 
Fmax leaves, and non-leaf nodes having exactly two children. 

Proof: We prove that every feasible solution of BLOCK-ALL 
can be reduced to another feasible solution that (i) 
corresponds to a subtree of LCP-tree($\mathcal{BL}$) as described in the 
proposition and (ii) has smaller or equal collateral damage. 
This is sufficient to prove Prop. 3.1 since an optimal 
solution is also a feasible one. 

Clearly, every feasible solution of Eq. (5)-(7), S, can be 
represented as a pruned subtree of the binary tree of all 
possible IP prefixes, with the same root and leaves being the 
prefixes used as filters. Assume that S uses a prefix p/l which 
is not in LCP-tree($\mathcal{BL}$). Then, either p/l does not contain 
any bad IPs or one of its two branches does not. Indeed, if 
this were not the case, i.e. if there were at least one bad IP in both 
branches, then p/l would be the longest common prefix of 
them, and as such it would be in LCP-tree($\mathcal{BL}$). 

If there are no bad IPs in prefix p/l, then we can safely 
remove the filter p/l, as it is not blocking any bad IPs. 
Similarly, if bad IPs are concentrated only in one of the two 
branches, then we can move the filter from p/l to its child 
that contains all bad IP(s). 

In both cases, we have constructed a new feasible solution, 
with smaller (or equal) collateral damage than the original 
solution. Iterating this process until all prefixes are in the 
LCP-tree shows that any feasible solution can be transformed into a 
feasible solution corresponding to a subtree of LCP-tree($\mathcal{BL}$), 
as described in the proposition and having smaller or equal 
collateral damage. Therefore, an optimal solution 
can also be transformed to that form. 

Finally, we note that every node of the subtree so 
constructed has two (or zero) child nodes. By contradiction, 
a set of filters that can be represented as a subtree of the 
LCP-tree with (at least) one node p with exactly one child 
node corresponds to leaving unfiltered all bad IPs contained in 
the child node (prefix) of p which is not selected in the subtree.^3 
This violates the constraint in Eq.(7), and thus corresponds to a 
non-feasible solution of problem BLOCK-ALL. ■ 

Algorithm. Algorithm 1, which solves BLOCK-ALL, consists 
of two main steps. First, we build the LCP-tree from the 
input blacklist. Second, in a bottom-up fashion, we compute 
$z_p(F)$ $\forall p, F$, i.e. the minimum collateral damage needed to 
block all malicious IPs in the subtree of prefix p using 
at most F filters. Following a dynamic programming (DP) 
formulation, we can find the optimal allocation of filters in the 
subtree rooted at prefix p, by finding a value n and assigning 
F - n filters to the left subtree and n to the right subtree, so 
as to minimize the collateral damage. The fact that we need to 
filter all malicious addresses (leaves in the LCP tree) implies 
that at least one filter must be assigned to the left and right 
subtree, i.e. n = 1, 2, ..., F - 1. 

^3 Note that in the LCP-tree every node/prefix contains at least one bad IP. 



Algorithm 1 Algorithm for BLOCK-ALL 

1: build LCP-tree(BL) 
2: for all leaf nodes leaf do 
3:    z_leaf(F) = 0  ∀F ∈ [1, Fmax] 
4:    X_leaf(F) = {leaf}  ∀F ∈ [1, Fmax] 
5: end for 
6: level = level(leaf) - 1 
7: while level ≥ level(root) do 
8:    for all nodes p such that level(p) == level do 
9:       z_p(1) = g_p 
10:      X_p(1) = {p} 
11:      z_p(F) = min_{n=1,...,F-1} { z_{s_l}(F-n) + z_{s_r}(n) }  ∀F ∈ [2, Fmax] 
12:      X_p(F) = X_{s_l}(F-n) ∪ X_{s_r}(n)  ∀F ∈ [2, Fmax] 
13:   end for 
14:   level = level - 1 
15: end while 
16: Return z_root(Fmax), X_root(Fmax) 



For every pair of sibling nodes, $s_l$ (left) and $s_r$ (right), with 
common parent node p, we have the DP recursive equation: 

$$z_p(F) = \min_{n = 1, \ldots, F-1} \left\{ z_{s_l}(F - n) + z_{s_r}(n) \right\}, \quad F > 1 \tag{8}$$

with boundary conditions for leaf and intermediate nodes: 

$$z_{leaf}(F) = 0 \;\; \forall F \ge 1, \qquad z_p(1) = g_p \;\; \forall p \tag{9}$$

Once we compute $z_p(F)$ for all prefixes in the LCP-tree, we 
simply read the value of the optimal solution, $z_{root}(F_{\max})$. 
We also use the variables $X_p(F)$ to keep track of the set of 
prefixes used in the optimal solution. In lines (4) and (10) of 
Algorithm 1, $X_p(F)$ is initialized to the single prefix used. In 
line (12), after computing the new cost, the corresponding set 
of prefixes is updated: $X_p(F) = X_{s_l}(F - n) \cup X_{s_r}(n)$. 
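
The bottom-up recursion can be sketched compactly. The code below is a hedged illustration of the dynamic program, not the authors' implementation: it assumes an explicitly given whitelist with positive weights, writes prefixes as (base address, length), and derives each LCP node on the fly from the sorted blacklist instead of materialising the tree first. The function name `block_all` and the toy inputs are assumptions.

```python
def block_all(blacklist, good, f_max, bits=32):
    """Sketch of the BLOCK-ALL dynamic program, Eq.(8)-(9).

    blacklist: non-empty iterable of integer addresses to block.
    good: dict address -> positive weight for an explicitly given whitelist.
    Returns (z_root(f_max), X_root(f_max)) with filters as (base, length) pairs.
    """
    addrs = sorted(set(blacklist))

    def g(base, length):
        # Collateral damage g_p: total weight of whitelisted addresses in the prefix.
        shift = bits - length
        return sum(w for a, w in good.items() if (a >> shift) == (base >> shift))

    def solve(lo, hi):
        """table[F] = (z_p(F), X_p(F)) for the LCP node covering addrs[lo:hi]."""
        if hi - lo == 1:                                  # leaf, Eq.(9): z_leaf(F) = 0
            return [None] + [(0, [(addrs[lo], bits)])] * f_max
        first, last = addrs[lo], addrs[hi - 1]
        length = bits - (first ^ last).bit_length()       # longest common prefix length
        shift = bits - length
        base = (first >> shift) << shift
        split_bit = 1 << (shift - 1)
        mid = next(k for k in range(lo, hi) if addrs[k] & split_bit)
        left, right = solve(lo, mid), solve(mid, hi)      # children s_l and s_r
        table = [None, (g(base, length), [(base, length)])]   # Eq.(9): z_p(1) = g_p
        for F in range(2, f_max + 1):                     # Eq.(8)
            n = min(range(1, F), key=lambda k: left[F - k][0] + right[k][0])
            table.append((left[F - n][0] + right[n][0],
                          left[F - n][1] + right[n][1]))
        return table

    return solve(0, len(addrs))[f_max]


blacklist = [0, 3, 4, 5, 7, 8, 10, 11, 12]               # 4-bit toy example
good = {a: 1 for a in (1, 2, 6, 9, 13, 14, 15)}           # assumed unit weights
print(block_all(blacklist, good, f_max=4, bits=4))
# (3, [(0, 1), (8, 4), (10, 3), (12, 4)])
```

With four filters, the optimum on this toy input blocks 0*** as a whole (collateral damage 3) and spends the remaining three filters on 1000, 101* and 1100, which cause no collateral damage. Computing g_p by scanning the whitelist is a simplification; precomputed per-node sums would be needed to reach the complexity discussed in the text.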

Theorem 3.2: Alg. 1 computes the optimal solution of problem 
BLOCK-ALL: the prefixes that are contained in set 
$X_p(F)$ are the optimal $x_{p/l} = 1$ for Eq.(5)-(7). 

Proof: Recall that $z_{root}(F_{\max})$ denotes the value of the 
optimal solution of BLOCK-ALL with Fmax filters (i.e. the minimum 
amount of collateral damage), and $X_{root}(F_{\max})$ the set 
of filters selected in the optimal solution. Let $s_l$ and $s_r$ denote 
the two children nodes (prefixes) of root in the LCP-tree($\mathcal{BL}$). 
Finding the optimal allocation of Fmax > 1 filters to block all 
IPs contained in root (possibly the whole IP space) is equivalent 
to finding the optimal allocation of $x \ge 1$ filters to block all 
IPs in $s_l$, and $y \ge 1$ prefixes for bad IPs in $s_r$, such that 
$x + y = F_{\max}$. This is because prefixes $s_l$ and $s_r$ jointly 
contain all bad IPs. Moreover, both $s_l$ and $s_r$ contain at least 
one bad IP. Thus, at least one filter must be assigned to each 
of them. If Fmax = 1, i.e. there is only one filter available, the 
only feasible solution is to select root as the prefix to filter out. 
The same argument recursively applies to descendant nodes, 
until either we reach a leaf node, or we have only one filter 
available. In these cases, the problem is trivially solved by 
the condition in Eq.(9). ■ 

Complexity. Computing Eq.(8) for every node p and for 
every $F \in [1, F_{\max} - 1]$ involves $N(F_{\max} - 1)$ subproblems, 
one for every pair (p, F), with complexity $F_{\max} - 1$ each. 
$z_p(F)$ in Eq.(8) requires only the optimal solution at the 
sibling nodes, $z_{s_l}(F - n)$ and $z_{s_r}(n)$. Thus, proceeding from 
the leaves to the root, we can compute the optimal solution 
in $N(F_{\max} - 1)^2$. This simple bound can be made tighter by 
observing that, at every node in the LCP-tree, we do not 
need to compute $z_p(F)$ for all values $F \le F_{\max}$, but only 
for $F \le \min\{|leaves(p)|, F_{\max}\}$, where $|leaves(p)|$ is the 
number of the leaves under prefix p in the LCP tree. Moreover, 
the complexity of computing every single entry $z_p(F)$ is 
obviously F. Thus, the overall number of operations needed 
equals. 
where $N_i = \min\{F_{\max}, |leaves(i)|\}$. Let $L_i$ denote the 
level of node i in the LCP-tree, with the convention that we 
assign $L = 0$ to the root node. For every node such that $L_i < 
\lfloor \log(\ldots) \rfloor$, $N_i = F_{\max}$; otherwise, $N_i = |leaves(i)| < \ldots$, 
since the LCP-tree is a binary tree. Thus, we have 
where Eq.(11) uses the fact that if $0 < n_0 < n_1$, then 

$$\sum_{h = n_0}^{n_1} \frac{1}{2^h} \le \frac{1}{2^{n_0 - 1}}.$$
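
The inequality used in Eq.(11), namely that the sum of $1/2^h$ for $h$ from $n_0$ to $n_1$ is at most $1/2^{n_0 - 1}$, follows from the closed form of a finite geometric sum. A short derivation, added here for completeness:

```latex
\[
\sum_{h = n_0}^{n_1} \frac{1}{2^h}
  = \frac{1}{2^{n_0}} \sum_{k = 0}^{n_1 - n_0} \frac{1}{2^k}
  = \frac{1}{2^{n_0}} \left( 2 - \frac{1}{2^{\,n_1 - n_0}} \right)
  = \frac{1}{2^{n_0 - 1}} - \frac{1}{2^{n_1}}
  \le \frac{1}{2^{n_0 - 1}} .
\]
```

The bound is independent of $n_1$, so the contribution of all levels below a given level is controlled by the first level alone.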

Using this observation, the computation can be done in 
$O(N F_{\max})$, which is essentially $O(N)$, since $F_{\max} \ll N$ 
and Fmax does not depend on N but only on the TCAM size. 
Thus, the time complexity increases linearly with the number 
of malicious IPs N. This is the lowest achievable complexity, 
within a constant factor, since we need to read all N malicious 
IPs at least once. 



C. BLOCK-SOME 

Goal. Given: (i) a blacklist of malicious addresses, (ii) a set 
of legitimate sources, (iii) weights assigned to all addresses, 
which express relative importance, and (iv) a limit on the 
number of filters Fmax; select some source address prefixes 
to block so as to minimize the total cost, including the cost 
of collateral damage and the benefit of blocking malicious 
addresses. 

Formulation. This can be formulated within the general 
framework of Eq.(1)-(4), by assigning to good and bad 
addresses weights $w_i > 0$ and $w_i < 0$ respectively, to express 
their relative importance. The goal is to minimize the total 
cost, as in Eq.(1), which in this case includes both collateral 
damage $g_{p/l}$ and unfiltered malicious traffic $b_{p/l}$. 



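\begin{align}
\min \quad & \sum_{p/l} \left( g_{p/l} - b_{p/l} \right) x_{p/l} \tag{12} \\
\text{s.t.} \quad & \sum_{p/l} x_{p/l} \le F_{\max} \tag{13} \\
& \sum_{p/l \,:\, i \in p/l} x_{p/l} \le 1 \quad \forall i \in \mathcal{BL} \tag{14}
\end{align}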



Another difference from BLOCK-ALL is Eq.(14), which dictates 
that every malicious source must be covered by at most 
one prefix, but does not necessarily have to be covered. 

Characterizing an Optimal Solution. We can leverage again 
the structure of the LCP tree to characterize feasible and 
optimal solutions, with a proposition similar to Prop.3.1. The 
difference from BLOCK-ALL is that, because some bad IPs 
can remain unfiltered, the pruned subtree corresponding to a 
feasible solution can now have nodes with a single descendant. 

Proposition 3.3: Given $\mathcal{BL}$ and Fmax, there exists an 
optimal solution of BLOCK-SOME that can be represented as 
a pruned subtree of LCP-tree($\mathcal{BL}$) with the same root and up to 
Fmax leaves. 

Proof: In Prop. 3.1 we proved that any solution of Eq.(5)-(6) 
can be reduced to a (pruned) subtree of the LCP-tree with 
at most Fmax leaves. Moreover, we note that the constraint in 
Eq.(14), which imposes the use of non-overlapping prefixes, 
is automatically satisfied by taking the leaves of the pruned 
subtree as the selected filters. This proves that any feasible 
solution of BLOCK-SOME can be transformed into a pruned 
subtree of the LCP-tree with at most Fmax leaves, and thus 
so can an optimal solution. ■ 
Algorithm. The algorithm is similar to Algorithm 1 in that it 
uses the LCP-tree and a similar DP approach. The difference 
is that not all addresses need to be covered and, at each step, 
we can assign n = 0 filters to the left or right subtree, i.e. in 
line (11) of Algorithm 1: n = 0, 1, ..., F. We can recursively 
compute the optimal solution as before: 



$$z_p(F) = \min_{n = 0, \ldots, F} \left\{ z_{s_l}(F - n) + z_{s_r}(n) \right\} \tag{15}$$



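with boundary conditions for intermediate ($p$) and leaf nodes:

$$z_p(0) = 0 \quad \forall p \tag{16}$$

$$z_p(1) = \min\left\{ g_p - b_p,\; \min_{n = 0, 1} \left\{ z_{s_l}(1 - n) + z_{s_r}(n) \right\} \right\} \tag{17}$$

$$z_{leaf}(F) = -b_{leaf} \quad \forall F \ge 1 \tag{18}$$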
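
The recursion of Eq.(15)-(18) can be sketched in a few lines of Python. This is only an illustrative sketch, not the paper's Algorithm 1: the `Node` class, the function name `block_some`, and the toy tree are assumptions introduced here, and the tree is given by hand rather than built from a blacklist. The option of placing a single filter on the whole prefix is tried for every f >= 1; for f >= 2 it is never better than splitting, so this does not change the result.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Node:
    """One node of a (hypothetical) LCP-tree: a prefix and its two children."""
    g: float                          # total weight of good addresses under the prefix
    b: float                          # total weight of bad addresses under the prefix
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def block_some(node: Node, f: int,
               memo: Optional[Dict[Tuple[int, int], float]] = None) -> float:
    """Minimum cost z_p(f) of using at most f filters inside the subtree of node."""
    if memo is None:
        memo = {}
    if f == 0:                        # Eq. (16): no filters, nothing is blocked
        return 0.0
    key = (id(node), f)
    if key in memo:
        return memo[key]
    best = node.g - node.b            # one filter on the whole prefix (leaf: Eq. (18))
    if node.left is not None and node.right is not None:
        for n in range(f + 1):        # Eq. (15): n filters to the right, f - n to the left
            best = min(best,
                       block_some(node.left, f - n, memo)
                       + block_some(node.right, n, memo))
    memo[key] = best
    return best


# Toy tree (assumed data): three bad addresses of weight 1, good weight on inner prefixes.
leaf_a, leaf_c, leaf_d = Node(0.0, 1.0), Node(0.0, 1.0), Node(0.0, 1.0)
inner = Node(1.0, 2.0, leaf_c, leaf_d)
root = Node(5.0, 3.0, leaf_a, inner)
costs = [block_some(root, f) for f in range(4)]
```

In this toy tree every prefix that aggregates addresses carries more good weight than it saves in filters, so each additional filter is spent on one more individual bad address.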

Complexity. The analysis of BLOCK-ALL can be applied 
to this algorithm as well. The complexity turns out to be the 
same, i.e. linearly increasing in N as well. 

BLOCK-ALL vs. BLOCK-SOME. There is an interesting
connection between the two problems. The latter can be
regarded as an automatic way to select the best subset from
$\mathcal{BL}$, in terms of the weights $w_i$, and run BLOCK-ALL only
on that subset. The advantage is that we do not need to search
for the optimal subset, which is automatically given in the
final solution. In the extreme case that much more importance
is given to the bad rather than the good addresses,
BLOCK-SOME degenerates to BLOCK-ALL.

D. TIME-VARYING BLOCK-ALL(SOME) 

So far, we have considered the static problem of filtering
a fixed set of source IP addresses. However, malicious
source IPs appear/disappear/reappear in a blacklist over time
[9]. In this section, we consider the problem of filtering a
dynamic set of source IPs, i.e., varying over time. This is
equivalent to considering different blacklists, one at every
time an IP is inserted or deleted from the blacklist. Let
us denote by $\{\mathcal{BL}_{T_0}, \mathcal{BL}_{T_1}, \ldots, \mathcal{BL}_{T_i}, \ldots\}$ the set of
different blacklists as sampled at times
$T_0 < T_1 < \cdots < T_i < \cdots$, when a new IP is inserted in the
blacklist or an old one is removed. The trivial approach to the
dynamic BLOCK-ALL problem is to run Alg.1 from scratch at
every time instance. As noted, the computational complexity of
Alg.1 is low: it grows linearly with the number of IP addresses
in the blacklist, N. However, if the overlap between two
successive blacklists is large enough, we can exploit the
correlation between them to construct a more efficient scheme,
which updates filters as needed, while leaving most of them
unchanged. More formally, consider the following problem:

Goal. Given a set of blacklists
$\{\mathcal{BL}_{T_0}, \mathcal{BL}_{T_1}, \ldots, \mathcal{BL}_{T_i}, \ldots\}$ collected at different
times $T_0 < T_1 < \cdots$, and $F_{\max}$ filters, find the set
of filtering rules $\{S_{T_0}, S_{T_1}, \ldots, S_{T_i}, \ldots\}$ at every time such
that, $\forall i = 0, 1, \ldots$, $S_{T_i}$ solves BLOCK-ALL(SOME) for input
blacklist $\mathcal{BL}_{T_i}$.

Algorithm. As mentioned above, if there is no or low overlap
between successive blacklists, the obvious solution to this
problem is to run the BLOCK-ALL algorithm every time
a new blacklist is provided. Otherwise, if only a few IPs are
inserted/removed from one blacklist to the next, we
can update all and only the filters affected by that change.
For example, consider two blacklists, $\mathcal{BL}_{T_{i-1}}$ and $\mathcal{BL}_{T_i}$, which
differ only in a single new IP inserted in $\mathcal{BL}_{T_i}$. Assume
that $S_{T_{i-1}}$, the solution to the BLOCK-ALL problem with
blacklist $\mathcal{BL}_{T_{i-1}}$, has already been computed. We want to find
an efficient algorithm that computes $S_{T_i}$.




Fig. 3. As an example, assume a 6-bit IP space, instead of the usual
32 bits. A new IP, corresponding to 37 in decimal notation, is inserted in the
blacklist made up of IPs: 3, 10, 15, 17, 22, 31, 32, 33, 57, 58. Its insertion
requires that all and only its predecessor nodes in the LCP-tree are updated
according to Eq.(8) (or Eq.(15) if we are running BLOCK-SOME). Moreover, a new
node, in gray, is created to denote the longest common prefix between 37 and
32 (or 33). Note that all other nodes corresponding to the longest common
prefixes between 37 and other IPs in the blacklist are already in the initial
LCP-tree.
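
The 6-bit example of Fig. 3 can be checked with a short sketch. It relies on a standard property of binary tries, used here as an assumption about the LCP-tree: its internal nodes are exactly the longest common prefixes of addresses that are adjacent in sorted order. The helper names `lcp` and `lcp_nodes` are hypothetical, and a prefix is represented as a (value of the leading bits, length) pair.

```python
from typing import Iterable, Set, Tuple

BITS = 6  # toy address width from Fig. 3 (instead of the usual 32 bits)


def lcp(a: int, b: int, bits: int = BITS) -> Tuple[int, int]:
    """Longest common prefix of a and b as (value of the leading bits, length)."""
    length = bits - (a ^ b).bit_length()   # number of identical leading bits
    return a >> (bits - length), length


def lcp_nodes(addrs: Iterable[int], bits: int = BITS) -> Set[Tuple[int, int]]:
    """Internal nodes of the LCP-tree: LCPs of neighbours in sorted order."""
    s = sorted(set(addrs))
    return {lcp(x, y, bits) for x, y in zip(s, s[1:])}


blacklist = [3, 10, 15, 17, 22, 31, 32, 33, 57, 58]
before = lcp_nodes(blacklist)
after = lcp_nodes(blacklist + [37])
new_nodes = after - before   # the only new intermediate node: prefix 100/3
```

Inserting 37 (binary 100101) leaves every existing intermediate node in place and adds the single prefix 100/3, the longest common prefix of 37 and 32 (or 33), which matches the gray node described in the caption.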



Basically, there are two separate cases depending on
whether or not the new IP is covered by some prefix which
is already filtered in $S_{T_{i-1}}$. If this is the case, no further
action is needed, and $S_{T_i} = S_{T_{i-1}}$. Otherwise, we need to
modify the filters to also cover the new IP. An efficient way
to do so is illustrated in Fig. 3. When a new IP appears in
the blacklist, only one intermediate node needs to be added to
the LCP-tree: the one corresponding to the longest common
prefix between the new node and its "closest" IP already
in the blacklist (gray node in Fig. 3). As learnt from the
previous sections, an optimal allocation of f filters at prefix
p/l depends only on how these f filters are allocated to the
children nodes of p in the LCP-tree. Thus, the insertion of
a new IP in the blacklist requires only the re-computation
of $z_p(f)$ and $X_p(f)$ $\forall f$, through Eq.(8), for all and only the
predecessors of the new node in the LCP-tree (nodes along the
dashed path in Fig. 3). Multiple insertions can be handled by
iterating the above procedure for every insertion. Handling
removal operations (i.e. IPs that are removed from a blacklist)
is similar: when removing an IP, we also remove its parent
node, since it stops being the longest common prefix of two
IPs, and we update all other predecessor nodes according to
Eq.(8).
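
A minimal sketch of this incremental update is given below, under several assumptions that are not in the paper: it uses the BLOCK-SOME recursion of Eq.(15)-(18) instead of Eq.(8), each node caches its own table `z[0..Fmax]`, the good weight `g_prefix` of the new intermediate prefix is supplied by the caller, and the names `Node`, `recompute`, `build` and `insert_beside` are hypothetical. Only the new intermediate node and its predecessors are recomputed.

```python
class Node:
    def __init__(self, g=0.0, b=0.0, left=None, right=None):
        self.g, self.b = g, b                  # good / bad weight under the prefix
        self.left, self.right, self.parent = left, right, None
        for child in (left, right):
            if child is not None:
                child.parent = self
        self.z = []                            # cached DP table z[0..fmax]


def recompute(node, fmax):
    """Fill node.z from the children's cached tables (Eq. (15)-(18))."""
    z = [0.0] * (fmax + 1)
    for f in range(1, fmax + 1):
        best = node.g - node.b                 # one filter on the whole prefix
        if node.left is not None and node.right is not None:
            for n in range(f + 1):
                best = min(best, node.left.z[f - n] + node.right.z[n])
        z[f] = best
    node.z = z


def build(node, fmax):
    """From-scratch computation: post-order traversal of the whole tree."""
    if node.left is not None:
        build(node.left, fmax)
        build(node.right, fmax)
    recompute(node, fmax)


def insert_beside(sibling, new_leaf, g_prefix, fmax):
    """Add new_leaf under a new LCP node next to sibling; update predecessors only."""
    parent = sibling.parent
    lcp = Node(g=g_prefix, b=sibling.b + new_leaf.b, left=sibling, right=new_leaf)
    lcp.parent = parent
    if parent is not None:
        if parent.left is sibling:
            parent.left = lcp
        else:
            parent.right = lcp
    recompute(new_leaf, fmax)
    touched, node = 0, lcp
    while node is not None:                    # walk up towards the root
        if node is not lcp:
            node.b += new_leaf.b               # the new bad address is below this prefix
        recompute(node, fmax)
        touched += 1
        node = node.parent
    return touched


FMAX = 3
a, c, d = Node(b=1.0), Node(b=1.0), Node(b=1.0)
root = Node(g=5.0, b=3.0, left=a, right=Node(g=1.0, b=2.0, left=c, right=d))
build(root, FMAX)
touched = insert_beside(a, Node(b=1.0), g_prefix=2.0, fmax=FMAX)
incremental = list(root.z)
```

Re-running `build(root, FMAX)` after the insertion should give the same root table as `incremental`, while the incremental path recomputed only two nodes here (the new intermediate node and the root).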

We note that since any LCP-tree is also a binary tree, there
are at most $\log(N)$ predecessors of any leaf node, thus the
above procedure requires $O(\log(N) F_{\max})$ operations. This
is a more efficient update scheme than running Alg.1 from
scratch if and only if the number of insert/remove operations
that need to be performed to obtain $\mathcal{BL}_{T_i}$ from the previous
blacklist, $\mathcal{BL}_{T_{i-1}}$, is less than $\frac{N}{\log(N)}$. Otherwise it is less
expensive to simply run Alg.1 with input the new blacklist,
$\mathcal{BL}_{T_i}$.
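
As a rough worked example of this break-even point, suppose (an assumption made here for illustration; the text only states that Alg.1 is linear in N) that a from-scratch run costs on the order of $N F_{\max}$ operations, that each incremental update costs on the order of $\log_2(N) F_{\max}$, and that the logarithm is base 2. Then $k$ updates are cheaper than one full run when

$$k \, \log_2(N) \, F_{\max} < N \, F_{\max} \iff k < \frac{N}{\log_2 N}, \qquad \text{e.g. } N = 10^5: \ \frac{10^5}{\log_2 10^5} \approx \frac{10^5}{16.6} \approx 6 \times 10^3 .$$

Under these assumptions, a blacklist of 100,000 addresses would favor incremental updates as long as fewer than roughly 6,000 addresses change between two consecutive snapshots.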

Finally, we note that we can use the same approach to solve 
the dynamic BLOCK-SOME problem. In that case as well, 
arrivals/departures of malicious addresses from the blacklist
can be handled by insertions/deletions in/from the LCP-tree;
the filters should be updated accordingly so that they provide 
an optimal solution to the static BLOCK-SOME problem for 
the input blacklist at every time. 

E. FLOODING 

Goal. Given: (i) a blacklist of malicious addresses, (ii) a
set of legitimate sources, (iii) the amount of traffic that each
generates, (iv) a limit on the number of filters Fmax, and
(v) a constraint on the link capacity (bandwidth) C; select
some source address prefixes to block so as to minimize the
collateral damage and make the total traffic fit within the link
capacity.

Formulation. 

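$$\min \sum_{p/l} g_{p/l} \, x_{p/l} \tag{19}$$

$$\text{s.t.} \quad \sum_{p/l} x_{p/l} \le F_{\max} \tag{20}$$

$$\sum_{p/l} \left( g_{p/l} + b_{p/l} \right) \left( 1 - x_{p/l} \right) \le C \tag{21}$$

$$\sum_{p/l \,:\, i \in p/l} x_{p/l} \le 1 \quad \forall i \in \mathcal{BL} \tag{22}$$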

where $g_{p/l}$ and $b_{p/l}$ denote the amount of good and bad traffic
from prefix p/l, respectively. Eq.(22) indicates that we are
interested in blocking some, not all, malicious sources, and
that we should not use overlapping prefixes. Before the attack,
the total good traffic could fit within the capacity; after
flooding, the total traffic, $t_0 = \sum_{p/l} (g_{p/l} + b_{p/l})$, exceeds
the capacity. Eq.(21) says that the total traffic that remains
unblocked after filtering should fit within the link capacity C.

Characterizing an Optimal Solution. We use the LCP tree
for all addresses $\mathcal{BL} \cup \mathcal{G}$. Furthermore, to account for Eq.(21),
we assign a cost, $t_p$, to every node in the LCP tree, representing
the total traffic generated by prefix p/l, $t_p = g_p + b_p$.

Proposition 3.4: Given $\mathcal{BL}$, $\mathcal{G}$, $F_{\max}$, and C, there exists
an optimal solution of the FLOODING problem that can be
represented as a pruned subtree of LCP-tree($\mathcal{BL} \cup \mathcal{G}$), with the
same root, up to $F_{\max}$ leaves, and s.t. the total cost of the
leaves is $\ge t_0 - C$.

Proof: The proof follows the same lines as that of Prop. 3.1.
It can be shown that every feasible solution of FLOODING,
S, can be mapped to another feasible solution, S', which (i)
corresponds to a subtree of LCP-tree($\mathcal{BL} \cup \mathcal{G}$) as described in
Prop. 3.4, and (ii) whose collateral damage is smaller than or
equal to the collateral damage of S.

To see this, assume S uses a prefix p/l which is not in
LCP-tree($\mathcal{BL} \cup \mathcal{G}$). There cannot be good or bad sources in
each of the two sibling prefixes, p/(l+1). If this were the case,
p/l would be their longest common prefix, and consequently
it would appear in LCP-tree($\mathcal{BL} \cup \mathcal{G}$).

Thus, there are two cases: if p/l does not include any good
or bad source, we can simply remove it; otherwise we can filter
only the branch that has some sources. Since the removed
branch does not have active sources, the obtained solution is
still feasible and the overall collateral damage is not increased
(we are filtering a subset of what was already filtered). Iterating
this process until all prefixes are in LCP-tree($\mathcal{BL} \cup \mathcal{G}$) proves
that any feasible solution can be interpreted as a subtree of the
LCP-tree, where the leaves are the actual filters used. Thus,
an optimal feasible solution can also be represented in this
way.

Finally, in order to keep the allowed traffic
within the capacity C, the filtered traffic, represented by the
sum of costs $t_p$ at the subtree leaves, must be greater than or
equal to $t_0 - C$. ■

Algorithm. FLOODING is a 2-dimensional knapsack problem
(2KP), with an additional constraint, Eq.(22),
that makes it harder. 2KP is a "very hard" problem: not only
is it NP-hard, but a fully polynomial-time approximation
scheme for this problem is also unlikely to exist,
since it would imply that P = NP [6]. For FLOODING we
obtain the following hardness result:

Theorem 3.5: The optimization problem FLOODING, in 
Eq.(19)-(22), is NP-Hard. 

Proof: It is obvious that FLOODING is in NP. To prove
that it is also NP-hard, we consider the KP problem with a
cardinality constraint:

$$\max \sum_{i \in I} p_i x_i, \quad \text{s.t.} \ \sum_{i \in I} w_i x_i \le C_1 \ \text{and} \ \sum_{i \in I} x_i = k \tag{23}$$

which is known to be NP-hard [4], and we show that it
reduces to FLOODING. First, note that any solution of
FLOODING that uses $F < F_{\max}$ filters can be transformed
to another feasible solution with exactly $F_{\max}$ filters, without
increasing the collateral damage.^ Therefore, the inequality in
Eq.(20) can be replaced by an equality without affecting the
collateral damage of the optimal solution. Second, we define
$\bar{x}_{p/l} = 1 - x_{p/l}$ and $\bar{F}_{\max} = \left( \sum_{p/l} 1 \right) - F_{\max}$, and we rewrite
the above problem:

$$\max \sum_{p/l} g_{p/l} \, \bar{x}_{p/l} \quad \text{s.t.} \ \sum_{p/l} \bar{x}_{p/l} = \bar{F}_{\max} \tag{24}$$

$$\sum_{p/l} \left( g_{p/l} + b_{p/l} \right) \bar{x}_{p/l} \le C, \qquad \sum_{p/l \,:\, i \in p/l} x_{p/l} \le 1 \quad \forall i \in \mathcal{BL} \tag{25}$$

For a given instance of Problem (23), we construct an equivalent
instance of Problem (24)-(25) by introducing the following
mapping. For $i = 1, \ldots, N$: $g_i = p_i$, $(g_i + b_i) = w_i$.
For p/l that is not in the blacklist: $g_{p/l} = 0$ and
$(g_{p/l} + b_{p/l}) = C + 1$. Moreover, we assign $\bar{F}_{\max} = k$ and
$C = C_1$. With this assignment, a solution to the KP problem (23)
can be obtained by solving FLOODING and then taking the values
of the variables $\bar{x}_{p/l}$ s.t. p/l is in the blacklist. ■
Therefore, we do not look for a polynomial-time
algorithm. Instead, we designed a pseudo-polynomial time
algorithm that optimally solves FLOODING, and whose
complexity grows linearly with the number of active sources
(either good or bad).

^This can be proved using the LCP-tree structure. Given a solution, S,
with F < Fmax filters, (as long as F < N) there always exists a filter that
can be replaced by two filters, corresponding to its children. The solution
constructed in this way has F + 1 filters, keeps blocking all IPs blocked in
S, and has value less than or equal to the value of S.

Let $z_p(F, c)$ be the minimum collateral damage when solving the
FLOODING problem with F filters and capacity c:

$$z_p(F, c) = \min_{\substack{n = 0, \ldots, F \\ m = 0, \ldots, c}} \left\{ z_{s_l}(F - n, c - m) + z_{s_r}(n, m) \right\} \tag{26}$$
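
A small sketch of the recursion in Eq.(26) follows. The boundary conditions are not given in this part of the text, so the ones used here are assumptions: a subtree whose total traffic already fits within c needs no filter (cost 0), a subtree that does not fit and has no filters left is infeasible (cost infinity), and a single filter on a prefix blocks all of its traffic at collateral damage $g_p$. Traffic is assumed to be integer-valued, c is interpreted as the traffic allowed to pass from the subtree, and the names `Node` and `flooding` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Node:
    g: int                            # good traffic under the prefix
    b: int                            # bad traffic under the prefix
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def flooding(node: Node, f: int, c: int,
             memo: Optional[Dict[Tuple[int, int, int], float]] = None) -> float:
    """Minimum collateral damage with at most f filters and at most c traffic let through."""
    if memo is None:
        memo = {}
    if node.g + node.b <= c:          # assumed boundary: everything already fits
        return 0
    if f == 0:                        # assumed boundary: too much traffic, no filters left
        return math.inf
    key = (id(node), f, c)
    if key in memo:
        return memo[key]
    best = node.g                     # block the whole prefix with one filter
    if node.left is not None and node.right is not None:
        for n in range(f + 1):        # Eq. (26): split filters ...
            for m in range(c + 1):    # ... and allowed capacity between the children
                best = min(best,
                           flooding(node.left, f - n, c - m, memo)
                           + flooding(node.right, n, m, memo))
    memo[key] = best
    return best


# Toy tree (assumed data): two good sources (3 and 2 units) and two attackers (4 and 5).
p1 = Node(3, 4, Node(3, 0), Node(0, 4))
p2 = Node(2, 5, Node(2, 0), Node(0, 5))
root = Node(5, 9, p1, p2)
two_filters = flooding(root, 2, 5)    # block both attackers individually
one_filter = flooding(root, 1, 7)     # cheapest single prefix that relieves the link
```

With two filters and capacity 5 both attackers can be blocked individually at no collateral damage; with a single filter and capacity 7 the cheapest feasible choice in this toy instance is the prefix `p2`, which sacrifices 2 units of good traffic.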

Complexity. The DP approach computes $O(C F_{\max})$ entries
for every node. Moreover, the computation of a single entry,
given the entries of descendant nodes, requires $O(C F_{\max})$
operations, Eq.(26). We can leverage again the observation
that we do not need to compute $C F_{\max}$ entries for all nodes
in the LCP tree. At a node p, it is sufficient to compute
Eq.(26) only for $c = 0, \ldots, \min\{C, \sum_{i \in p/l} t_i\}$ and
$f = 0, \ldots, F$. Therefore, the optimal solution to FLOODING,
$z_{root}(F_{\max}, C)$, can be computed in $O((N + |\mathcal{G}|) \, C^2)$ time.
The algorithm has pseudo-polynomial complexity since it is
polynomial in C, which cannot be bounded by the input length.
More importantly, its complexity increases linearly with the
number of IP sources in $\mathcal{BL} \cup \mathcal{G}$.

FLOODING vs. BLOCK-SOME. To see the connection 
between FLOODING and BLOCK-SOME, let us consider a 
partial Lagrangian relaxation of (19)-(22): 



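$$\max_{\lambda \ge 0} \left\{ \min \sum_{p/l} \left( g_{p/l} - \lambda \left( g_{p/l} + b_{p/l} \right) \right) x_{p/l} + \lambda \left( \sum_{p/l} \left( g_{p/l} + b_{p/l} \right) - C \right) \right\} \tag{27}$$

$$\text{s.t.} \quad \sum_{p/l} x_{p/l} \le F_{\max} \tag{28}$$

$$\sum_{p/l \,:\, i \in p/l} x_{p/l} \le 1 \quad \forall i \in \mathcal{BL} \tag{29}$$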



For every fixed $\lambda \ge 0$, problem (27)-(29) is equivalent to
(12)-(14) for a specific assignment of weights $w_i$. This shows
that dual feasible solutions of FLOODING are instances of
BLOCK-SOME for a particular assignment of weights. The
dual problem, in the variable $\lambda$, aims exactly at tuning the
Lagrangian multiplier to find the best assignment of weights.^
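
The tuning of the multiplier can be illustrated with a generic projected subgradient loop. This is a sketch under stated assumptions, not the authors' exact iteration: the capacity term is interpreted as "traffic left unblocked minus C", the inner problem for a fixed multiplier is solved by a BLOCK-SOME-style DP with per-prefix cost $g_p - \lambda (g_p + b_p)$, the step size `step0 / k` is an arbitrary diminishing rule, and the names `Node`, `inner` and `dual_ascent` are hypothetical.

```python
from functools import lru_cache


class Node:
    def __init__(self, g, b, left=None, right=None):
        self.g, self.b, self.left, self.right = g, b, left, right


@lru_cache(maxsize=None)
def inner(node, f, lam):
    """(cost, blocked traffic) of the best choice of at most f filters under node."""
    if f == 0:
        return (0.0, 0.0)
    t = node.g + node.b
    # Either use no filter here, or block the whole prefix at cost g - lam * t.
    best = min((0.0, 0.0), (node.g - lam * t, float(t)))
    if node.left is not None and node.right is not None:
        for n in range(f + 1):
            cl, tl = inner(node.left, f - n, lam)
            cr, tr = inner(node.right, n, lam)
            best = min(best, (cl + cr, tl + tr))
    return best


def dual_ascent(root, fmax, cap, steps=50, step0=0.1):
    """Projected subgradient ascent on the multiplier of the capacity constraint."""
    lam, best_dual = 0.0, float("-inf")
    t0 = root.g + root.b                        # total offered traffic
    for k in range(1, steps + 1):
        cost, blocked = inner(root, fmax, lam)
        best_dual = max(best_dual, cost + lam * (t0 - cap))
        violation = t0 - blocked - cap          # subgradient: unblocked traffic minus C
        lam = max(0.0, lam + (step0 / k) * violation)   # projection onto lam >= 0
    return best_dual, lam


# Toy tree (assumed data): good sources of 3 and 2 units, attackers of 4 and 5 units.
p1 = Node(3, 4, Node(3, 0), Node(0, 4))
p2 = Node(2, 5, Node(2, 0), Node(0, 5))
root = Node(5, 9, p1, p2)
best_dual, lam = dual_ascent(root, fmax=1, cap=7)
```

By weak duality, the value `best_dual` never exceeds the optimal collateral damage of the original problem (2 units for this toy instance with one filter and capacity 7), so it can serve as a lower bound while the multiplier is being tuned.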

^Problem (27)-(29) can be solved in a standard way with a projected
subgradient method [4]

(30)
(31)

where $x^{(k)}$ is the $k$th iterate, $x^{*}(\lambda^{(k)})$ is the optimal solution of (27)-
(29) for $\lambda = \lambda^{(k)}$, $\alpha_k > 0$ is the $k$th step size, and $[\cdot]^{+}$ indicates the
projection onto the set of non-negative numbers.

F. DIST(RIBUTED)-FLOODING

Goal: Consider a victim V that connects to the Internet
through its ISP and is flooded by a set of attackers (listed
in a blacklist $\mathcal{BL}$), as in Fig. 1(a). To reach the victim, attack
traffic has to pass through one or more ISP routers; let $\mathcal{R}$ be
the set of unique routers on the path from some attacker to the
victim. Let each router $u \in \mathcal{R}$ have capacity $C^{(u)}$ on the
downstream link (towards V) and a limited number of filters
$F^{(u)}_{\max}$. We assume that the volume of good/bad traffic through
every router is known. Our goal is to allocate filters across all
routers, in a distributed way, so as to minimize the total
collateral damage and avoid congestion on all links of the ISP
network.

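Formulation. Let the variables $x^{(u)}_{p/l} \in \{0, 1\}$ indicate
whether or not filter p/l is used at router u. Then the
distributed filtering problem can be stated as:

$$\min \sum_{u \in \mathcal{R}} \sum_{p/l} g^{(u)}_{p/l} \, x^{(u)}_{p/l} \tag{32}$$

$$\text{s.t.} \quad \sum_{p/l} x^{(u)}_{p/l} \le F^{(u)}_{\max} \quad \forall u \in \mathcal{R} \tag{33}$$

$$\sum_{p/l} \left( g^{(u)}_{p/l} + b^{(u)}_{p/l} \right) \left( 1 - x^{(u)}_{p/l} \right) \le C^{(u)} \quad \forall u \in \mathcal{R} \tag{34}$$

$$\sum_{u \in \mathcal{R}} \sum_{p/l \ni i} x^{(u)}_{p/l} \le 1 \quad \forall i \in \mathcal{BL} \tag{35}$$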

Characterizing an Optimal Solution. Given the sets $\mathcal{BL}$, $\mathcal{G}$,
$\mathcal{R}$, and $F^{(u)}_{\max}$, $C^{(u)}$ at each router, we have:

Proposition 3.6: There exists an optimal solution of
DIST-FLOODING that can be represented as a set of $|\mathcal{R}|$ different
pruned subtrees of the LCP-tree($\mathcal{BL} \cup \mathcal{G}$), each corresponding
to a feasible solution of FLOODING for the same input, and
s.t. every subtree leaf is not a node of another subtree.

Proof. Feasible solutions of DIST-FLOODING allocate filters
on different routers s.t. Eq.(33) and (34) are satisfied
independently at every router. In the LCP tree, this means
having $|\mathcal{R}|$ subtrees, one for every router, each having at most
$F^{(u)}_{\max}$ leaves and associated blocked traffic $\ge t_0^{(u)} - C^{(u)}$,
where $t_0^{(u)}$ is the total incoming traffic at router u. Each
subtree on its own can be thought of as a feasible solution of a
FLOODING problem. Eq.(35) ensures that the same address
is not filtered multiple times at different routers, to avoid
redundant waste of filters. In the LCP-tree, this translates into
every leaf of the different subtrees appearing in at most one
subtree. ■
Algorithm. Constraint (35), which imposes that different 
routers do not block the same prefixes, prevents us from a 
direct decomposition of the problem. To decouple the problem, 
consider the following partial Lagrangian relaxation: 



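$$L(x, \lambda) = \sum_{u \in \mathcal{R}} \sum_{p/l} g^{(u)}_{p/l} x^{(u)}_{p/l} + \sum_{i \in \mathcal{BL}} \lambda_i \left( \sum_{u \in \mathcal{R}} \sum_{p/l \ni i} x^{(u)}_{p/l} - 1 \right)$$

$$= \sum_{u \in \mathcal{R}} \left( \sum_{p/l} \left( g^{(u)}_{p/l} + \lambda_{p/l} \right) x^{(u)}_{p/l} \right) - \sum_{i \in \mathcal{BL}} \lambda_i \tag{36}$$

where $\lambda_i$ is the Lagrangian multiplier (price) for the constraint
in Eq.(35), and $\lambda_{p/l} = \sum_{i \in p/l} \lambda_i$ is the price associated with
prefix p/l. With this relaxation, both the objective function
and the other constraints immediately decompose into $|\mathcal{R}|$
independent sub-problems, one per router u:

$$\min \sum_{p/l} \left( g^{(u)}_{p/l} + \lambda_{p/l} \right) x^{(u)}_{p/l} \tag{37}$$

$$\text{s.t.} \quad \sum_{p/l} x^{(u)}_{p/l} \le F^{(u)}_{\max} \tag{38}$$

$$\sum_{p/l} \left( g^{(u)}_{p/l} + b^{(u)}_{p/l} \right) \left( 1 - x^{(u)}_{p/l} \right) \le C^{(u)} \tag{39}$$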

The dual problem is:

$$\max_{\lambda_i \ge 0} \ \sum_{u \in \mathcal{R}} h_u(\lambda) - \sum_{i \in \mathcal{BL}} \lambda_i \tag{40}$$

where $h_u(\lambda)$ is the optimal solution of (37)-(39) for a given
$\lambda$. Given the prices $\lambda_i$, every sub-problem (37)-(39) can be
solved independently and optimally by router u using e.g. Eq.
(26). Problem (40) can be solved using a projected subgradient
method, similarly to Eq.(30)-(31), as discussed in [4]. Note,
however, that since $x \in \{0, 1\}$ the dual problem is not always
guaranteed to converge to a primal feasible solution [7], [8].

Distributed vs. Centralized Solution. The above formulation
lends itself naturally to a distributed implementation. Each
router only needs to solve its own subproblem (37)-(39)
independently from the others. A single machine (e.g. the
victim's gateway or a dedicated node) should solve the master
problem (40) to iteratively find the prices that coordinate all
subproblems. Thus, at every iteration of the subgradient method,
the new $\lambda_i$'s need to be broadcast to all routers. Given the
$\lambda_i$'s, the routers independently solve a sub-problem each and
return the computed $x^{(u)}_{p/l}$ to the node in charge of the master
problem. Even in a centralized setting, our distributed scheme is
efficient because it lends itself to parallel computation of
Eq.(32)-(34).

IV. Practical Evaluation 

The focus of this paper is the design of optimal and com- 
putationally efficient algorithms for a variety of filter selection 
problems. In this section, we use real blacklists to demonstrate 
that filter optimization brings significant gain in practice. The 
reason is that, in practice, malicious sources appear clustered 
in the IP address space, a feature that is exploited by our 
algorithms. Due to lack of space, the simulations presented 
in this section are not exhaustive. However, they demonstrate 
the above point as well as some of the structural properties 
of the solution for BLOCK-ALL and BLOCK-SOME, which
are at the heart of this framework. As discussed in section 
III, FLOODING is essentially an instance of BLOCK-SOME 
for a particular assignment of weights and DIST-FLOODING 
consists of several FLOODING problems. 

A. Simulation Setup 

We analyzed 61-day traces from Dshield.org [3], a repository
of firewall and intrusion detection logs from about 2,000
different organizations. The dataset includes 758,698,491
attack reports, from 32,950,391 different IP sources. Each report
includes, among other things, the malicious source IP and the
victim's destination IP. By studying these logs, we verified that
malicious sources are clustered in a few prefixes, rather than
uniformly distributed over the IP space, which has also been
observed by others [9]. This is an important observation in
practice, because clustering in a blacklist means that a small
number of filters is sufficient to block most malicious IPs at
low collateral damage.

Fig. 4. BLOCK-ALL: collateral damage (CD), normalized over the number of
malicious sources (N), vs. number of filters Fmax. We compare Algorithm 1
to K-means. (In particular, we simulated Lloyd's heuristic for K-means, which
is NP-hard; we ran 50 runs to avoid local minima.) We also run Algorithm 1
on two traces, those with the highest and lowest degree of clustering.

We looked at each victim (individual IP destination) in 
the dataset; the set of sources attacking each victim is a 
blacklist for our simulations. This "view" varies considerably 
among victims. We also generated good traffic according 
to a realistic scenario: a domain hosting 20 servers, each 
server with average rate of 1,000 incoming good connections 
per second, each connection generating 5KB of traffic. We 
generated the good IP addresses according to the multifractal 
distribution in [10]. 
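
As a quick sanity check of the scale of this scenario, the aggregate good traffic implied by these numbers (assuming each connection contributes its 5 KB within the second in which it arrives, and 1 MB = 1,000 KB) is

$$20 \ \text{servers} \times 1{,}000 \ \frac{\text{connections}}{\text{s}} \times 5 \ \text{KB} = 100{,}000 \ \text{KB/s} = 100 \ \text{MB/s}.$$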

B. Simulation Results 

BLOCK-ALL. In Fig. 4, we chose two different victims,
each attacked by a large number (up to 100,000) of malicious
IPs in a single day. We picked these particular two because
they have the highest and the lowest degree of attack source
clustering observed in the entire dataset. We ran Algorithm 1
on these two blacklists and made several observations. First,
the optimal algorithm performs significantly better than a
generic clustering algorithm that does not exploit the structure
of IP prefixes. In particular, it reduces the collateral damage
(CD) by up to 85% compared to K-means, when run on
the same (high-clustering) blacklist. Second, as expected,
the degree of clustering in a blacklist matters. The CD is
lowest (highest) in the blacklist with the highest (lowest) degree
of clustering. Results obtained for other victim
destinations and days were similar and lay in between the
two extremes. A few thousand filters were sufficient to
significantly reduce collateral damage (CD) in all cases.

BLOCK-SOME. In Fig. 5, we focus on the blacklist with the
least clustering and thus the highest CD (dashed line in Fig. 4).
In this worst-case scenario, an alternative to BLOCK-ALL is
BLOCK-SOME, which allows the operator to trade off lower
CD for unblocked bad IPs (UBIP) by appropriately tuning
the weights. For simplicity, in Fig. 5, we assigned the same
weights $w_g$ and $w_b$ to all good and bad sources; however, the
framework has the flexibility to assign weights to specific IPs.
In Fig. 5(a), the CD is always smaller than the corresponding
CD in Fig. 4; they become equal only when we block all bad
IPs. In Fig. 5(b), we see that BLOCK-SOME reduces the CD
by 60% compared to BLOCK-ALL while leaving unblocked
only 10% of bad IPs and using only a few hundred filters.
In Fig. 5(c), the total cost decreases as Fmax increases. As
defined in Eq.(12), this is the weighted sum of CD and UBIP.
However, the behavior of these two competing factors is more
complicated and depends strongly on the input blacklist. In the
data we analyzed we observed that CD tends to first increase
and then decrease with Fmax, while UBIP tends to decrease.^
The ratio $w_b / w_g$ captures the effort made by BLOCK-SOME
to block all bad IPs and become similar to BLOCK-ALL.^

Fig. 5. BLOCK-SOME: (a) collateral damage (CD), (b) number of unblocked
bad IPs (UBIP), (c) total cost (CD - W · UBIP). The operator expresses
relative tolerance to UBIP vs. CD by tuning the weight $W = w_b / w_g$. We
considered a higher and a lower value of W (both powers of 2).

^We can explain this as follows. When a new filter is available, the new
optimal solution can be constructed by (i) blocking a new cluster of bad
IPs, (ii) splitting a blocked cluster into two filters, or (iii) a combination of
(i) and (ii) with merging of existing filters. For small Fmax, option (i) is
dominant: the inherent clustering makes it possible to find a cluster that is
not blocked yet; this increases CD and reduces UBIP. When this is not possible,
option (ii) becomes dominant, CD decreases and UBIP remains constant or
decreases slowly.

^Since we picked a ratio $w_b / w_g > 1$, bad IPs are more important. When
Fmax is high, the algorithm first tries to cover small clusters or single bad
IPs. In the case of high W, this happens around 10,000 filters: CD remains
almost constant in this phase, at the end of which all bad IPs are filtered (as
in Fig. 5(b)). In the final phase, the algorithm releases single good IPs, which
are less important, and all bad IPs are blocked similarly to BLOCK-ALL.

V. Our Work in Perspective

A. The Bigger Picture of Defense against Malicious Traffic

Dealing with malicious traffic is a hard problem that requires
the cooperation of several components. In this paper,
we did not propose a novel solution; instead, we optimized
the use of filtering, a mechanism that already exists on
the Internet today and is a necessary building block of any
bigger solution. We focused on the optimal construction of
filtering rules, which can be then installed and propagated 
by filtering protocols [11], [12]. We rely on a detection 
module, e.g. an intrusion detection system or historical data, 
to distinguish good from bad traffic and provide us with 
a blacklist. Detection is a difficult but orthogonal problem 
to the contribution of this paper. The sources of legitimate 
traffic are also assumed known, for estimating the collateral 
damage. Finally, we consider addresses in the blacklist to be
true and not spoofed. This is reasonable today, when attackers
have the luxury to use botnets and control a huge number
of infected hosts for a short period of time, so that they do
not even need to use spoofing. In 2005, less than 20% of
addresses were spoofable [13], while in 2008, only 7% of
addresses in Dshield logs were found likely spoofed [9]. Even
if there is some amount of spoofed traffic, our algorithms
treat it as the rest of malicious traffic and weigh the cost
vs. the benefit of blocking a source prefix (which may include
both malicious spoofed and legitimate traffic). Looking into
the future, there is also a number of proposals promising 
to enforce source accountability, including ingress filtering 
[14], self-certifying addresses [15], packet passports [16]. To 
the extent that spoofing interferes with the ability to define 
blacklists, our algorithms work best together with an anti- 
spoofing mechanism, but also do the best that can be done 
today without it. 

A practical deployment scenario is that of a single network 
under the same administrative authority, such as an ISP or 
campus network. The operator can use our algorithms to create 
filtering rules, at a single or at several routers, in order to 
optimize the use of its own resources and defend against an 
attack in a cost-efficient way. Our distributed algorithm may 
also prove useful, not only for a distributed protocol of routers 
within the same ISP, but also in the future, when different 
ISPs start cooperating against common enemies. In a different 
context, our algorithms may also be applicable to configure 
firewall rules to protect public-access networks, such as uni- 
versity campus networks or web-hosting networks; although 
firewalls are implemented in software, there is still an incentive 
to minimize the number of their rules for performance reasons. 

The following papers are related to our work. In [17], source 
filtering via ACLs was studied against DDoS attacks; however, 
the filters were heuristically selected and the approach was 
entirely simulation-based. There is a body of work on firewall 
rule configuration [18], which focuses on management and 
misconfigurations, not on resource allocation. Furthermore, 
they consider firewalls for enterprises, which are not supposed 
to be accessed from outside and thus can be protected without 
filtering rules. In our workshop paper [19], we also studied
optimal source-based filtering by aggregating source addresses
into continuous ranges (of numbers in $[0, 2^{32} - 1]$), not prefixes.
This was an easier problem that allowed for greedy solutions.
Unfortunately, ranges are not implementable in ACLs;
furthermore, it is well-known that ranges cannot be efficiently
approximated by a combination of prefixes [5]. Therefore,
despite the intuition we gained in [19], we had to solve the
problem of prefix filtering from scratch in this paper.

B. Relation to Knapsack Problems

The optimal filter selection belongs to the family of multi- 
dimensional knapsack problems (dKP) [4]. The general dKP 
problem is well-known to be NP-hard. The most relevant 
variation to us is the knapsack with cardinality constraint
(1.5KP) [21], [22], which has d = 2 constraints, one of
them being a limit on the number of items:
$\sum_{j \in \mathcal{N}} w_j x_j \le C$, $\sum_{j \in \mathcal{N}} x_j \le k$. The 1.5KP problem is also NP-hard.
These classic problems do not consider correlation between
items. However, in filtering, the selection of an item (prefix) 
voids the possibility to select other items (all overlapping 
prefixes). dKP problems with correlation between items have 
been studied in [23], [24], where the items were partitioned 
into classes and up to one item per class was picked. In our 
case, a class is the set of all prefixes covering a certain address. 
Each item (prefix) can belong simultaneously to any number of 
classes, from one class (/32 address) to all classes (/0 prefix).
To the best of our knowledge, we are the first to tackle the 
case where the items belong to classes that are not a partition 
of the set of items. 

Finally, continuous relaxations do not help. Allowing $x_{p/l}$
to be fractional corresponds to rate-limiting of prefix p/l.
However, there is no advantage either from a practical point of
view (rate limiters are more expensive than ACLs because, in
addition to looking up packets in TCAM, they also require rate and
computation on the fast path) or from a theoretical point of
view (the continuous 1.5KP is still NP-hard [25]).

In summary, the special structure of filtering problems, i.e. 
the hierarchy and overlap of candidate prefixes, leads to novel 
variations of dKP that could not be solved by existing methods. 

VI. Conclusion 

In this paper, we introduced a formal framework to study
filtering problems. The framework is rooted in the theory of
the knapsack problem and provides a novel extension of it.
Within it, we formulated five practical problems, presented in
increasing order of complexity. For each problem, we designed
optimal algorithms that are also low-complexity (linear in
the input size) in practical scenarios. We also highlighted
connections between different problems: at the heart of all
problems lies BLOCK-SOME; BLOCK-ALL and FLOODING
are special instances for specific assignments of weights,
and DIST-FLOODING decomposes into several independent
FLOODING problems. Finally, we ran simulations using
Dshield traces; a key insight was that our algorithms can
exploit the spatial clustering that is inherent in real blacklists.

There are several directions for future work. We plan to ex- 
tend the framework to dynamically update the filtering rules as 
blacklists change over time, combine source- with destination- 
based filtering, deal with adversarial scenarios, and study the 
interaction between filtering and detection mechanisms. We 
will also provide a more extensive experimental evaluation, 
which is not the focus of this paper. 